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Abstract 
Moment  type  estimation  of  characteristics  of  a  successively  sampled 
finite  population  requires  finding  a  solution  to  a  pair  of  transcendental 
equations.   Some  global  properties  of  two  structurally  distinct  pairs  — 
a  symmetric  and  an  asymmetric  pair  —  of  such  equations  are  presented. 
Results  of  a  monte  carlo  experiment  designed  to  compare  performance  of 
estimators  computed  by  solving  these  transcendental  equation  pairs  are 
reported. 


1.   INTRODUCTION 

Successive  sampling  models  have  recently  been  used  to  characterize 
random  phenomena  as  diverse  as  oil  and  gas  field  discovery  and  the 
occurrence  of  software  bugs  (Barouch  and  Kaufman  (1976),  Littlewood,  B.  (1981), 
Gordon  (1981,  1983),  Andreatta  and  Kaufman  (1983)),  and  the  draft  lottery 
(Du  Mouchel  (1970)).   What  distinguishes  these  applications  from  the  usual 
treatment  of  successive  sampling  in  the  sample  survey  literature  is  the 
absence  of  information  about  the  sample  frame  which  allows  a  priori  computation 
of  the  probability  that  a  generic  element  of  a  finite  population  with  N 
elements  will  be  included  in  a  sample  of  arbitrary  size  n  <  N. 

Cordon  (1983)  has  suggested  use  of  a  moment-like  estimator  for 
parameters  of  a  successively  sampled  finite  population.   The  basic  idea 
is  to  split  the  sample  into  two  parts  and  then  set  approximate  Horvitz- 
Thompson  type  estimators  equal  to  one  another  in  order  to  calculate 
approximations  of  inclusion  probabilities.   Two  moment  matching  alternatives 
are  outlined,  each  of  which  implies  a  pair  of  transcendental  equations.   The 
equation  pair  studied  in  detail  by  Gordon  (1983)  is  symmetric  in  form,  while 
the  alternative  is  not.   He  shows  that  with  probability  one  the  symmetric 
pair  possesses  a  unique  solution  in  the  asymptotic  limit  N  ->-  «>  with  n/N 
fixed. 


\>niile  intuitively  one  might  expect  the  two  alternative  formulations 
to  be  asymptotically  equivalent,  their  behavior  for  finite  samples  is  at 
issue:   In  particular  when  N  is  finite,  one  questions  the  existence 
and  uniqueness  of  solutions  of  each  pair.   What  are  the  finite  sample 
properties  of  an  estimate  of  the  number  of  elements  in  the  population 
or  of  an  estimate  of  the  sum  of  magnitudes  of  population  elements  generated 
by  these  two  alternatives? 

It  is  the  purpose  of  this  paper  to  establish  a  sufficient  condition 
for  existence  of  a  solution  of  the  asymmetric  pair  when  sample  size  is 
finite  and  to  compare  properties  of  estimators  of  population  attributes 
based  on  solutions  to  both  pairs. 


1.1   SUCCESSIVE  SAMPLING 


Consider  a  finite  population  of  N  elements  with  labels  U  =  {1,2,...,N} 

and  associated  magnitudes  A  =  (A  ,  .  .  .  ,A,^^} ;  i.e.  the  magnitude  of  element 

j  £  U  is  A.  >  0,  j  =  1,2,...,N.   A  successive  sampling  scheme  induces  a 

distribution  on  permutations  of  elements  of  U  as  follows:   for  any 

permutation  (i  , ...,i  )  of  (1,2,...,N), 

N 
P{(i^,...,i^)lA}   =   n  A   /(A.  +  ...  +  A  ).  (1.1) 

1     ^         j  =  l   j    ^j  ^ 

An  alternative  representation  of  (1.1)  can  be  given  in  terms  of 

exponential  order  statistics:   let  x  , ...,x^  be  independent,  identically 

distributed  exponential  random  variables  with  means  equal  to  one.   Then 

(Gordon  (1983)) 

X.         X.  X. 

p{(i      i  )|A}   =  P{^  <  -^   <  .  .  .  <  — }.  (1-2) 

h       ^2  "n 

This  latter  representation  is  an  analytical  lever  for  generation  of  moment- 
type  estimators  of  unobserved  population  magnitudes  when  sampling  is  incomplete. 

Suppose  that  we  observe  an  unordered  sample  s   =  {i^,...,i  } 
^'^  n     1      n 

consisting  of  the  first  n  elements  of  (i  ,...,i  ),  n<N,  and  that  s   is 

the  only  information  available  about  the  sampling  scheme.   How  might  we 

use  s   to  estimate  properties  of  A?   In  particular,  we  may  wish  to  estimate 

N 
R   =    E   A.,  the  total  sum  of  magnitudes  in  A,  the  number  N  of  elements 

in  U,  or  the  empirical  frequency  function  of  magnitudes  in  A. 


1.2   MOMENT  MATCHING  ESTIM^\TORS 

Gordon  (1983)  presents  a  moment  type  method  for  estimating  finite 

population  properties.   His  method  rests  on  three  key  ideas:   first,  if 

one  had  access  to  all  elements  of  A,  then  it  is  possible  to  compute  the 

probability  P{k.£s}   i   ^i.^'^)  that  element  k  U  appears  in  a  sample  s   of 

size  n.   Let  g(A^)  be  a  given  function.   Armed  with  tt  (n)  ,  k  =  1,2,...,N, 

N 
an  unbiased  estimator  of  the  sum  E   g(A.),  is   Z   g(A^)/TT  (n),  an  estimator 

j=l  kes 

introduced  by  Horvitz  and  Thompson  (1952)  .   If  all  elements  of  A  were  known 

with  certainty  a  priori  there  would  be  no  estimation  problem. 

The  second  key  idea  is  that  when  N  is  large  and  n/N  =  f   is  fixed, 

N 
there  exists  a  unique  solution  t  .  to  N-n   =    Z   exp{-tA  }  for  which 

■^  k=l        ^ 

|1  -  exp{-t  A)  -  ^,  (n)  I  =  0(N~''')  (Gordon  (1983),  Theorem  2.2).   Thus,  if 

A  ,  .  .  .  ,A,,^^  were  known,  t^  could  be  computed  and  1  -  exp{-t  Ji  }  would  closely 

approximate  tt  (n)  .   Once  t^  is  obtained  one  can  replace  ''f,  (n)  with 

1  -  exp{-t^A,  }  to  obtain  an  approximately  unbiased  Horwitz-Thompson  estimator. 

Again,  the  hitch  is  that,  given  a  sample  s  ,  only  A.  ,.,.,A.   are  observed 

1       n 
and  t  ,  depends  on  all  elements  of  A. 

To  overcome  these  difficulties  Gordon  proposes  a  third  idea.   If  the 

complete  sample  is  s  ,  split  it  into  an  "early"  part  s   consisting  of  the 

first  m<n  observations  and  a  "late"  part  consisting  of  the  remaining  n-m 

observations.   In  order  to  simplify  notation  and  with  no  loss  in  generality, 

relabel  elements  of  s   so  that  s   =  {l,2,...,n}  and  s   =  {l,2,.,.,m}. 

n  n  m 


N 
Define   h   =  ra/n,    and    let    t,     solve   N-m  =      Z      exp    {-CA,  }.      Then 

^  k=l         ^ 

R^(6)   =    Z   A^/(l  -  exp{-t^A^})  and  R  (6)   =    I   \/(l  -  exp{l  -  exp{-t  A  1) 

N    5 
are  approximately  unbiased  estimators  of  the  characteristic  R(6)  =   E   A,  , 

0   <_  <5   <_  1.   For  two  distinct  choices  of  5,  5   and  5„,  R^(5  )  =  R^(5  ) 

d  R^(6„)  =  R^(6„)  constitute  two  equations  in  two  unknowns,  t,  and  t,. 


an 


In  particular  for  the  case  6=6  and  6=1,  one  obtains  the 

symmetric  pair  of  equations  (with  t   =  ct,t^  =  a+3)  . 

k      J 

n       A,  m      A, 

(1. 3a) 


1  -  e  -^    1  -  e 


and 


n       A,  m      A 

k=i^     -^■^^^^\  ^  jii^     ^' 

1  -  e  -^    1  -  e 

A  different  estimator  arises  if  one  utilizes  s   and  the  "late" 

n 

portion  of  s   in  the  following  fashion:   The  "late"  part  (A  .,,..., A  1  H  s  i 
n  ^      m+1      n     n|m 

of  s   is  generated  by  successively  sampling  {A    ,...,A^}.   Define  5,  =  (n-ra)/N 

N  n 

and  let  t   be  a  solution  to    E    exp  {-t  A^}  =  N-n.   Then    Z    A. / (l-exp{-t  A. }) 
^  k=Tirfl  j=nrfl   ^  ^ 

m 
estimates  R  -   Z   A..   By  the  same  logic  that  led  to  (1.3a)  and  (1.3b),  with 


=  t  we  set 


m     A.  m         n      A. 

Z  13-—  =    Z   A.  +    T        ^3—-  .  (1.3c) 

j  =  l  ,       ^j      j  =  l   J    k=n^l  ,       J- 
-"l-e         -^  1-e-' 

Together  with  (1.3a),  (1.3c)  is  a  pair  of  equations  that  can  be  solved  for 
a  and  S.   We  call  (1.3a)  and  (1.3c)  the  asymmetric  pair,  and  begin  our 
analysis  with  a  study  of  properties  of  this  pair. 


2.   PROPERTIES  OF  THE  ASYMMETRIC  PAIR 


For  easy  reference  we  restate  the  asyminetric  pair  as 


n  m 

Z  A./(l-exp  {-(a+6)A.})  -   I   A. /I-  exp  {-aA.})  (2.1a) 

j=l  J  J      j=l  J  J 


and 


m  m        n 

E  A. /(I-  exp  {-aA.})=   I  A  +   I   A./(l-exp  {-6A.}).       (2.1b) 
j=l  J  J     j  =  l  J    j=:iH-l   ^  J 

Our  first  task  is  to  find  conditions  that  guarantee  existence  of  • 

a  solution.   Consider  first  the  existence  of  a  solution  when  3-»-°°.  As  B -*•<», 

(2.1a)  and  (2.1b)  are  redundant;  i.e.  both  take  the  form 


(2.2) 


n 

m 

E      A. 

=        ^      A   /(I 

-  exp    {-aA. }  ) 

j=l      J 

j=l      J 

J 

Lenma  1:   Define  the  function  g  of  a  as 

n        m 
g(a)   =   I   A.  -   E  A. /(I  -exp  {-aA.}),        a>0  ,  (2.3) 

j  =  l  J    j=l  J  J 

For  A.,  ...,  A   finite  and  positive,  a  solution  a   >  0  of  g(a)  =  0  exists 
and  is  unique. 

Proof ;   (Compare  with  Gordon  (1983),  Lemma  5.8(a))  With  g  as  defined 

in  (2.3), 

n 
(i)   lim  g(a)   =    E   A.  >  0. 
ct  ->■  CO  j=m^-l  •' 

(ii)   g(a)  is  continuous  for  0  <  a  <  ■=. 

Ill  3 

(iii)   For  a  small,  1  -  exp  {-aA.}   =  aA.  -  -r-  a  A.  +  0(a  ),  so 


n      ^  m  n       .  m 

g(a)  =  Z      A   --   I J     =    L   A.  -  -  i:   (1  +  4  ^A.  +  O(ci^)) 

j=l  ^    ''  j=l(l  -  i  aA.  +  0(a)^)     j  =  l  ^    ^  j=l      ^   J 

whereupon 

n       ^   m 
g(a)   =  --+(    I     A.  -  y  Z     A.)  +  0(a).  (2. A) 

""  2  =  1     ^         ^   3=1     J 

(iv)   For  a  >  0  and   small,  (iii) implies  g(a)  <  0. 
The  function  g(a)  is  positive  for  large  a,  negative  for  small  a 
and  continuous.   Thus  a  zero  a  of  g(a)  =  0  must  exist. 

o   -aA. 

.       m    A.  e   -^  ■ 

(v)    As  -r°-  =    Z  ^ ■ >   0,  3g/3a  at  a  =  a   is  positive, 

9a     .  T       -aA.  ~  c 

^='  [1  -  e   J]2 

so  the  solution  (a  ,  <=)  is  unique.  I 

c 


Lemma  2:   No  solution  (a-.,  6„)  to  (2.1a)  and  (2.1b)  with  a^  >  a  exists. 

Proof :   First  consider  a  solution  of  the  form  (ct-^,  5),  8-^°°.   Lemma  1 

shows  that  (a  ,  °°)   with  g(a  )  =  0  is  a  unique  solution.   In  addition 
c  c 

5g/3a  >  0.   For  any  a,  g  <  °°,  the  left  side  of  (2.1a)  is  greater  than 

n 

Z      A..   Consequently,  there  can  be  no  solution  (ct^,  S_)  with  a^  >  a  .  ■ 
j=l   J 

We  next  establish  a  sufficient  condition  for  existence  of  a  solution 

(a-,  B„)  to  (2.1a)  and  (2.1b)  with  0  <  a^  <  a  . 
U   u  U    c 


1  "         -        1  "^ 

Theorem  1:   Defining  —  l     A.  =  A  and  —  Z   A.  =  A  ,  a  sufficient  condition 
n  .  ,   J    n     m  .  .   J    m' 

for  existence  of  a  solution  (a„,  6^)  with  0<ci^<a  is  A  >A.  When 

0   U  U    c     m    n 

A   >  A   the  number  of  solutions  is  odd. 
m    n 


We  set  the  stage  for  proof  of  Theorem  1  with  three  propositions  based 

on  consideration  of  two  functions:   the  Implicit  Function  Theorem  allows 

us  to  define  a  function  6.  (a)  via  (2.1a)  and  a  function  6^'(a) 

via  (2.1b).   A  solution  of  (2.1a)  and  (2.1b)  obtains  when  6, (a)  =  62(a). 

It  is  evident  that  lim  6,  (a)  =  lim  6- (a)  =  «>  as  a -^  a  from  below.   We  shall 

1  /  c 

show  that  6,  (a)  and  S»(a)  approach  zero  as  a-»0,  that  when  A  >  A  , 
8-(a)  >  S, (a)  for  a  in  a  neighborhood  of  zero  and  that  S^(a)  <  6^ (a)  for  a 
in  a  neighborhood  of  a  .   These  facts  imply  that  8, (a)  and  6-(a)  must 
cross  an  odd  number  of  times  in  the  open  interval  (0,  a  ). 


Proposition  1:   lim  6, (a)  =  lim  3^ia)    =  0- 

Proof;   Compute  a  Taylor  expansion  of  B,  (a)  and  B^i^)    for  a  >  0  and  small. 
Equation  (2.1a)  takes  the  form 

+  4   E  A.   =  -  +  T  E   A..  (2.5) 


a  +   B^(a)    2  j^^  j     a   2  .^^     j 

Since  (2.5)  possesses  a  singularity  on  both  sides,  equating  singular 
parts  yields  the  leading  term  of  the  expansion  of  6-,  (a)  •   This  is 

6t (a)  =   ( )a.   The  second  term  of  the  expansion  involves  a  constant 

1         m 

C,  such  that 

6.  (a)   =   (^^^)a[l  +  C^a], 
1  ml 


10 


so 

a  +  6,  (a)   s   (^a)[l  +  (^)C,a] 
1  m  n    X 

This  allows  us  to  write 


a  +  8^(a)    an       1 
Substitution  of  this  relation  in  the  Taylor  expansion  of  (2.1a)  yields 

1     2m(n-m)   .^^^  J 
Therefore, 


Similarly , 


n 

/  N   -   /n— m,  r  1  ,   ^   I-   A  1 

2  m        Zm  ^^  J 


Thus  6,  (a)  -^  0  and  B.Cct)  ->  0  as  ct  -^  0.  | 


Proposition  2:   For  a  >  0  and  a  -  0  and  A^  >  A^,  &^(a.)    >    S^(a)  . 

Proof:   Subtract  the  two  Taylor  expansions  of  3^(a)    and  62(a).   The 
leading  term  cancels  and 

=   ,  "   ,  [A   -  A  ]   >   0.  ■ 
2(n-m)    m    n 


11 


Proposition  3:   For  a  -  a  >  0  and  a  -  a^,  B^{a)    <   e^(a). 

Proof:   In  the  vicinity  of  a^,  B^(a)  and  62(a)  are  unbounded  and  increase 
indefinitely.   Since  a  is  finite  and  since  m  and  n  are  finite,  we  can 
neglect  a  in  comparison  with  S^(a)  in  some  terms  on  the  left  hand  side 
of  (2.1a).   The  system  (2.1a)  and  (2.1b)  takes  a  slightly  modified 


form,  valid  for  a  -  o  , 

m  A.  n        A^ 


.^^  -(a+i(a))A.    "^  ._l.i        -6,(a)A 

j=l  ,  _  ^     1     J      2=^^     1  -  e  ^    ^ 


m       A. 

I   -   1 


(2.6a) 


-aA. 
3=^1  1  _  e   J 


and 

m     A.  m         n        A  ,,    , 

z  ^ =     I    A.  +     I     J-Tvrr-  •  ^2-^^^ 

J=l  1  .  e   2  J=l      2=^^   1  -  e  ^    ^ 

Since  at  a  =  a  we  have  a  solution  and  since  the  left  side  of  (2.6a) 
c 

equals  the  right  side  (2.6b),  we  equate  them.   Doing  so  yields, 

^   1  -  ^  1  -  e         1   »         ^2  ,, 

The  first  term  is  positive.   Thus  the  second  term  is  negative 
implying  6^(a)  >  62(a)  for  a^  -  a  >  0  and  a  -  a^.  ■ 


12 


Theorem  1  states  that  if  A   >  A  ,  then  (2.1a)  and  (2.1b)  have  at  least 

m     n 

one  solution  in  (0,  a  )  and  the  number  of  solutions  is  odd. 

c 


Proof;   A  solution  takes  place  when  B,(a)  =  62(a).   Define  5(a)  =  6A0.)    -   62(a) 

6(a)  is  continuous  and  differentiable  in  (0,  a  ).   Proposition  2  implies 

c 

5(a)  >  0  for  a  >  0  and  a  -  0,  and  Proposition  3  implies  6(a)  <  0  for 

a  -  a  >  0  and  a  -  a  .   This  guarantees  existence  of  at  least  one  solution 
c  c  ° 

a^   to  5 (a-)  =  0  in  the  open  interval  0  to  a  and  excludes  an  even  number 
of  zeros  of  6(a)  in  this  interval.  I 

The  theorem  provides  a  simple,  sample  based  sufficient  condition  for 

existence  of  a  solution;  i.e.  A  and  A  are  sample  statistics. 

m      n 

An  alternative  formulation  of  the  problem  of  demonstrating  existence 
and  uniqueness  of  a  solution  to  the  equations  studied  in  the  preceding 
subsection  is  to  study  the  character  of  solutions  to 

G(a)   =   F^(a  +  S^M.)    -   ¥^(a)  (2.8) 

where 

n  -xA. 

F  (x)   =   Z     A./(l-e   h,  (2.9) 

j=l  J 

m  -xA. 

F„(x)   =   Z  A./(l-e   ^),  (2.10) 

^        3=1  J 

and  S„(a)  is  defined  implicitly  by  (2.1b);  i.e.  with 

n  -xA. 

F,(x)   =    Z        A,/(i-e   ^),  (2.11) 

^        j=Tir^l  ^ 

6„(c()  is  the  value  of  6  satisfying 


F^(a)   =  mA  +  F,(B) 
2  m    H 


(2.12) 


13 


The  equivalence  becomes  transparent  upon  differentiating  G  and 
both  sides  of  (2.12)  with  respect  to  a.   Then 

^nf    ^  dS-(a)   3F     dF 

G-Ca)   =   ^^^'^^   =   (1  +  — ^ )  -^ (2.13) 

^  ^^'  da       ^     da   '    3a    da 

and  via  the  definition  of  6, (a) , 

dS,(a)     dF„  3F- 

1  +  _J =  — ^/-^  ,  (2.14) 

■^     da       da  '  3a  ' 

so  that 

3F,    d6_(a)   d8.(a) 


It   has    been   sho^-n    that   G(0)    =  \  Tn[A^   -   A    ].      When  A      <   A      a   solution 

^     n    m  n    m 

in  (0,  a  )  exists  (Theorem  1) ,  so  demonstration  that  at  any  zero  of  G(a) 
occurring  in  (0,  a  )  G'(a)  >  0  is  sufficient  to  establish  uniqueness  of 
the  solution.   (If  G(0)  <  0,  G'(a)  at  the  left-most  a  satisfying  G(a)  =  0 
must  be  positive.   If  G'(a)  >  0  at  any  a  e  (0,  a  )  satisfying  G(a)  =  0, 
then  the  solution  is  unique) . 


14 


3.      PROPERTIES   OF   THE  PAIR    (1.3a)    AND    (1.3b) 


We    turn  now   to    the  sy-mmetric   pair 

n  A^  m  A. 

,^.  -(a+6)A,  "      .^^  -aA.  ^^-^^ 

k=l  Ic  1  =  1    ,  1 

1-e  •'        1-e        -* 


and 


n  A^  m  A. 

T.     .    .gs  ,  =        Z  ^— r-  (3.2) 

,     ^  -(a+g)A,  .    ,  -aA. 

k=l    ,  Tc  -1  =  1  ,  J 

1-e  -J  1-e        -^ 

with  0  <_  6  <  1.  In  what  follows  we  shall  assume  that  A.  >  0,  j=l,2,.,.,n 

and  bounded. 

Our  first  observation  is  that  a  solution  to  (3.1)  and  (3.2)  for 

unbounded  B  does  not  exist.   If  6  -<■  o°,  (3.1)  and  (3.2)  become 

n         m    A. 

E   A.   =    E  ^— —  ■  (3.3) 

and 

n    ~      m     A. 

E   A.   =   Z   ^.  (3.4) 

A  necessary  condition  for  a  unique  solution  to  (3.3)  and 

(3.4)  is  that  they  be  redundant.   Since  (3.3)  and  (3.4)  are  in  general  not 
redundant,  the  solution  to  (3.3)  will  in  differ  from  that  for  (3.4),  so  no 
unique  value  of  a  may  solve  both  (3.3)  and  (3.4).   Consequently,  in  general, 
no  solution  to  (3.1)  and  (3.2)  of  the  form  (a, 6)  =  (a   ^o)  exists. 

In  addition,  the  solution  in  a  to  (3.3)  is  bounded,  so  we  conclude 
that  if  a  solution  (a  ,6„)  to  (3.1)  and  (3.2)  exists,  both  a_  and  g   must  be 
bounded. 


15 


We  next  examine  the  behavior  of  the  symmetric  pair  in  a  neighborhood 
of  the  origin  for  an  illustrative  value  —  of  c .  A  Laurent  expansion  of  the 
right-  and  left-hand  sides  of  (3.1)  and  (3.2)  yields 


n   ~  m 


a  +  g   a 


(3.5) 


and 


-1-  z  A'     =  i  E   A.%  (3.6) 

«  +  S  j=l  J      ^  j=l  J 

so  that  a  solution  at  (a, 6)  =  (0,0)  may  exist  only  if 

1   "    _w      1   ™   -1- 

—     Y.A.^     =     —     ZA^^.a.   very  special  constraint  on  values  of  A  ,  .  .  .  ,A  . 

"  j  =  l  J      ™  k=l  ^  in 

By  keeping  one  more  term  in  the  series  expansion  of  the  form  (3.5) 

and  (3.6)  and  solving  the  resulting  equations,  a  lower  bound  can  be 

established  on  possible  solutions  to  the  symmetric  pair.   However,  the 

approximate  solution  so  obtained  does  not  provide  any  information  about 

rates  of  convergence  of  numerical  methods  for  computing  solutions.   To 

this  end  define  the  averages 


B   =  7  I  A'f  a  =   m,n  (3.7) 

C.   =  7  Z  A.'  I  =   m,n,  (3.8) 

j  =  l  -^ 

and   approximate    (3.1)    and  (3.2)    by 

_4_^+   1        A        ^  im  +  im  A  (3.9) 

a  +  B2  n  a  2  m 

and 

__n_  C     +inB  S^c+imB.  (3.10) 

a+Sn2  n  am2  m 
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An  approximate  solution  is  then 

2m(C   -  C  ) 
^   =  -B "^  _   _ — -  (3.11) 

n[A  C   -  B  ]  -  in[A   C   -  B  ] 
n  n    n        m  n    m 

and,  as  with  B^(a)  (cf,  (2.5)), 

n  -  m  (3.12) 


a, 


m 
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4.   COMPUTATION  METHODS 


Consider  the  system  of  transcendental  equations 

"  -1    "  -1 

F  (a,6)  i    Z      A.[l-exp(-{a+6}A.)]    -  Y.      A.  [  l-expC-aA. )  ]    =0     (4.1) 

j=l   J  ^       j=l   J  J 

tn  ^    m         n  , 

F  (a,6)  H    E   A.[l-exp(-aA.)]~   -   Z   A.  -    I        A.  [l-exp(-3A. ) ]~   =  0 
j  =  l   J  ^       j=l   J    j=m+l   -^  ^ 

(4.2) 

"    1^  _i     "^    1,  _1 

F  (a,B)  =    E   A.^[l-exp(-{a+6}A.)]    -  Z      A.^[l-exp(-aA.)  ]    =0     (4.3) 
->         j  =  l   J  J       1=1-"  -^ 

From  (4.1),  (4.2)  and  (4.3),  one  obtains  two  independent  pairs.   The 

asymmetric  pair  (4.1)  and  (4.2)  and  the  symmetric  pair  (4,1)  and  (4.3). 

As  pointed  out  earlier,  each  of  the  sums  in  (4.1),  (4.2),  and  (4.3)  has 

a  pole  at  the  origin  (a, 3)  =  (0,0),  and  lira  F('i,3)  =  0  as  ti,  6  ^  0. 

The  range  of  possible  solutions  to  FA~x,i)    =   0  and  F^(a,3)  =  0  with  0  _<  S  j^  °° 

restricts  the  range  of  a  to  0  <^  oi  <_  a   =  ACRIT .   Furthermore,  the  fact  that 

3  can  be  unbounded  opens  up  the  possibility  of  exponential  underflows. 

Accounting  for  underflows  may  in  turn  give  rise  to  inaccuracies  in  numerically 

computed  solutions,  so  care  must  be  exercised  in  implementing  a  numerical 

scheme  to  solve  the  pairs  F  (a,S)  =  0,  F  (a,B)  =  0,  and  F  (a,S)  =  0,  F-(a,6)  =  0. 

n 
Since  we  have  an  arbitrary  choice  of  scale,  we  chose  to  set   E   A.  =  1. 

j=l   J 
The  computer  program  for  solving  both  symmetric  and  asymmetric  pairs  begins 

by  solving 

n        m 

Z   A.  -   Z   A.[l-exp(-aA.) ]"   =  0  (4.4) 

j  =  l   ^    3=1   ^  ^ 

for  a   =  ACRIT.   To  avoid  exponential  underflows  e     is  approximated 

by  zero.   This  protection  is  used  throughout  the  entire  numerical  scheme. 
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-12    -12 
To  eliminate  poles  at  the  origin,  a  lower  limit  of  (10    ,10   )  is  set 

on  (a, 8)  and  an  upper  limit  of  ACRIT-10     is  set  on  a. 

Once  ACRIT  is  computed  by  the  use  of  a  modified  regula   falsi  (MRF) 

routine  (GTRANS) ,  equation  (4.2)   is  solved  for  a  given  a  by  a  second  MRF 

routine  (GMRF)  to  yield  6  as  a  function  of  a.   This  6(a)  is  substituted 

in  equation  (4.1)  and  GTRANS  is  used  to  find  the  desired  oi=a  ,  which  is 

then  substituted  back  into  (4.2)  to  yield  6(a).   The  range  allowed  for 

-12         -10  -12   70 

a  is  (10    , ACRIT-10    )  and  for  6  is  (10   ,10   ).   Since  this  range  is 

large,  the  process  is  repeated  with  ae  (a^-1,  ot  +1)  ,  6  e  (6^.-1,  S„+l)  to 

-14 
narrow  the  range  on  which  the  regula  falsi  iteration  takes  place  with  10 

tolerance.   As  an  accuracy  check,  a  two-dimensional  Newton-Raphson  (N-R) 

scheme  is  implemented  utilizing  (a  ,6„)  as  the  starting  solution  with  a 

-14 
tolerance  of  10    .  \-n\en   the  results  from  both  methods  agree  to  desired 

accuracy,  the  solution  (o.~^,    S>    )    is  accepted    and   estimates  N  of  N  and 

R  of  R  are  computed. 

This  procedure  was  used  to  solve  both  symmetric  and  asymmetric 

pairs  for  400  Monte  Carloed   successive  samples  (cf.  section  5).   In 

most  cases  it  works  well.   However,  there   are  cases  for  which  the  GMRF 

and  GTRANS  equation  solvers  do  not  bracket  the  roots.   Then  an 

initial  guess  of  ACRIT/2  is  used  for  a,  GTRANS  finds  the  corresponding  6, 

and  the  2-dimensional  N-R  scheme  is  utilized  with  this  initial  guess. 

This  procedure  worked  quite  well  for  the  asymmetric  pair  but  failed  several 

times  for  the  symmetric  pair.    In  each  such  instance  a  solution  was  found  bv 

brute  force:   [0,  a^]  was  divided  into  30,000  subintervals  and  the  initial  root 

bracketing  interval  found  by  identification  of  change  of  sign  of  values  of 

6-^(a)    -  B^(a)  ,    6^(a)  a  solution  to  F  (a,  6)  =  0  for  given  a,  and  6^(a)  a 

solution  to  F  (a,  6)  =  0  • 
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A  unique  solution  to  the  asymmetric  pair  was  found  in  each  of  six 
cases  where  the  sufficient  conditon  A   >  A   is  not  satisfied. 

The  nature  of  the  difficulties  experienced  in  computing  solutions  is 
reflected  in  the  character  of  the  graph  of  6^(a)  and  Q^M    vs.  a.   For  many 
cases,  I  6  (ct)  -  8,  (a)  |  is  small  in  value  on  the  interval  (Oja^),  but  grows 
rapidly  as  a  ->■  a  . 


[Figure  4.1  Here] 


20 


Figure    4.1. 


1.25 


g(a)   .75 


6i(a) 
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5.   MONTE  CARLO  STUDY  OF  SOLUTIONS 

In  order  to  study  the  comparative  behavior  of  solutions  to  both 
the  symmetric  and  asymmetric  equation  pairs,  a  Monte  Carlo  simulation 
consisting  of  400  successive  samples  drawn  from  a  fixed  finite  population 
was  performed.   Samples  were  generated  from  a  population  of  N=170  elements 

and  magnitudes  A  , ...,A  _„,  with  A^  being  the  (k/N+l)st  fractile  of  a 

2 
lognormal  population  with  parameter  iu,o   )  =  (.83,  1.62).   This  choice  of 

2 
values  for  y ,  a  ,  and  N  matches  estimates  provided  by  Meisner  and  Demirmen 

(1980)  for  a  segment  of  the  North  Sea,  based  on  n=58  discoveries.   These 


^ 


particular  values  of  N,  y,  and  a~    correspond  to  R  =  T.      A.  =  818.   For 

j  =  l   J 
each  of  400  successive  samples  of  size  n=58,  a  solution  to  the  symmetric 


pair  (1.3a)  and  (1.3b)  and  a  solution  to  the  asymmetric  pair  (1.3a)  and 
(1.3c)  were  computed  by  the  methods  described  in  section  4.  Each  sample 
was  split  at  m=38. 

Tables  and  graphs  describing  properties  of  Monte  Carloed  sampling 
distributions  of  N  and  R  for  this  particular  case  are  presented  in  section  5.1. 
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5.1   SAMPLING  PROPERTIES  OF  R  AND  OF  N 


Summary  displays  are  in  the  form  of: 

(1)  Parallel  boxplots  of  N   and  of  R  values  generated  by 
symmetric  pair  and  by  asymmetric  pair  solutions. 

(2)  Quantile-quantile  plot  of  R-R  quantiles  vs.  unit 
normal  quantiles  for  the  symmetric  pair  and  for 
the  asymmetric  pair. 

(3)  Quantile-quantile  plot  of  N-M  quantiles  vs.  unit 
normal  quantiles  for  the  symmetric  pair  and  for  the 
asymmetric  pair. 

(4)  Measures  of  location  and  of  spread  for  N   and  for 
R  values . 

(5)  Scatterplot  of  N  values  generated  by  the  symmetric 
pair  vs.  N  values  generated  by  the  asymmetric  pair. 
A  similar  scatterplot  for  R  values. 

Some  tentative  conclusions  about  the  behavior  of  estimates  generated 

by  moment  matching  as  described  in  earlier  sections  emerge.   When  the 

particular  finite  population  used  in  this  Monte  Carlo   experiment  is 

successively  sampled  with  sample  size  n=58  and  a  split  at  m=38: 

(I)   Estimators  of  N  and  of  R  derived  by  solving  either 
the  symmetric  or  the  asymmetric  pair  of  equations 
are  negatively  biased. 

The  parallel  boxplots  for  N   and  R  values  (Figures  5.1  and  5.2),  the 

quantile-quantile  plots  for   N-N    and   R-R   values  (Figures  5.3  to  5.6), 

and  the  summary  statistics  in  Table  5.1  display  this  bias  in  different  ways. 

(II)   The  sampling  distributions  for  both  N   and  R  values 
generated  by  solving  the  symmetric  pair  exhibit 
larger  spreads  and  more  (right  tail)  outliers  than 
the  corresponding  distributions  generated  by  solving 
the  asymmetric  pair. 
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The  quantile-quantile  plots  of  Figures  5.3  and  5.4  for  R-R  values  exhibit 

fatter  than  normal  right  tails.   Figure  5.4,  in  particular,  is  a  display  of 

R-R  values  for  the  symmetric  pair,  which  exhibits  substantial  right  tail 

skewedness.   If,  however,  the  eight  largest  of  the  four  hundred  values  plotted 

are  disregarded,  the  graph  defined  by  the  remaining  points  appears  very  close 

to  a  straight  line. 

(Ill)   While  extreme  right  tails  (above  .98  quantiles) 
of  distributions  of  R-R  values  are  fatter  than 
normal,  these  distributions  appear  to  be  close  to 
normal  in  shape  elsewhere. 

In  contrast,  a  visual  examination  of  Figures  5.5  and  5.6  shows  that: 

(IV)   Distributions  of  N-N  generated  by  solving 

either  the  asymmetric  pair  or  the  symmetric 
pair  are  decisively  not  normal  in  shape. 
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TABLE  5 . 1 

SUMMARY  MEASURES  FOR  MONTE  CARLOED  SAMPLING 
DISTRIBUTIONS  OF  ESTIMATORS  N  AND  R 


SYMT^ETRIC  PAIR 


ASYMMETRIC  PAIR 


Measures  of  Location 


N 


Median 

146 

764 

Mean 

151 

765 

Lower  quartile 

134 

737 

Upper  quartile 

163 

791 

Minimum 

104 

651 

Maximum 

255 

925 

N 
145 
148 
134 
160 
108 
232 


R 

762 
760 
738 
783 
651 
852 


Measures  of  Spread 

Standard  Deviation  22.3  41.4 

Interquartile  Range  29  54 

Range  151  274 


19.3     33.1 

2  6       4.5 

124      201 
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5.2   SAMPLING  PROPERTIES  OF  SOLUTIONS  TO 
ASYMMETRIC  AND  SYMMETRIC  PAIRS 


Properties  of  the  sampling  distribution  of  solutions  to  symmetric 
and  to  asymmetric  pairs  are  displayed  in  a  fashion  similar  to  that  for 
N  and  for  R  values.   Table  5.2  presents  some  summary  statistics. 

Examination  of  parallel  boxplots  for  a  and  for  6  values  (Figures 

5.7  and  5.8)  shows  that 

(I)   The  distribution  of  6  values  for  the  symmetric 
pair  is  spread  over  a  much  larger  range  than 
corresponding  B  values  for  the  asymmetric  pair. 

Iifhile,  in  accord  with  asymptotic  theorv,  auantiles  of  i  values  and 

quantiles  of  B  values  plot  as  straight  lines  against  unit  normal  quantiles 

throughout  a  range  of  -2.0  to  +2.5  (Figures  5.9  to  5.13), 

(II)   Extreme  left  tails   (.02-. 025  fractiles  and  smaller) 
of  distributions  of  a   and  of  6  values  clearly  deviate 
from  normality. 

This  finding  accords  with  deviations  from  normality  seen  in  the  right  tails 

of  s£impling  distributions  for  N  and  for  R  values.   For  ot  values  generated 

by  the  asymmetric  pair,  a  quantile  transformation  exp{a/40}  of  a  (Figure  5.12) 

appears  slightly  closer  to  normal  than  quantiles  of  a.    (Figure  5.11). 

Figure  5 . lA  is  a  scatterplot  for  symmetric  vs.  asymmetric  pair  i  values 
and  Figure  5 . 15  a  similar  scatterplot  for  B  values.   That  a   values  generated 
by  symmetric  and  by  asymmetric  pairs  are  positively  correlated  (sample 
correlation  -  .652)  is  obvious.  The  same  is  true  for  6  values  (sample 
correlation  =  .751  ). 

More  interesting  is  the  heteroscedasticity  displayed  by  symmetric  vs 
asymmetric  pair  a  values.   This  feature  of  Figure  5.14  can  be  captured  by 
fitting  the  ratio  a^-'  /a,-'   of  the  j    monte  carloed  sample  symmetric  pair  a 

b         A 
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value,  a    ,  to  the  corresponding  asvnmietric  pair  a   value,  a    ,  with 
S  A 

a  model  of  the  form  "constant  plus  error".   To  wit, 

(J) 
-(TJ     =   e  +  e^J\    j=l,2,...,400.  (5.1) 

We   may    interpret    (5.1)    as    a  variance   stabilizing    transformation   of    a 

linear  model   of    the    form  a      =    ga      +   q   with   observed   values    of   error    term  r\    =   ql   z 

b      A  A 

exhibitinK  increasing  spread  as  the  value  of  'i.  increases.   The  behavior 

A 

of  the  equation  systems  leads  us  to  expect  that  6  =  1.0. 

and  that  when  6  =  1.0  the  empirical  distribution  of  e^"*  values  is 

approximately  normal  with  mean  zero. 

1   ^00    , 

Our  expectations  are  borne  out:   the  mean  -— --   Z    a  "'  /u  -''^  =  6  =  1  004 

400  .^-,    S    A 

and  the  graph  in  Figure  5.16  of  quant;iles  of  the  empirical  distribution  of 

residuals  ;     =  [a    /'^,   ]  -  6  versus  unit  normal  quantiles  appears 

S    A 

reasonablv  close  to  a  straight  line. 

A  slight  improvement  is  afforded  by  fitting  a  model 

^(j) 

-^     =   Y   +  Y  z'^J^  +  w^j\    1  =  1,2 400  (5.2) 

(j)       J-     - 

\ 
\ 

with  a^   =  7-r—   E   21.    and  Z     =  a    -  a..   Upon  doing  so  we  find 
A     400  .^i   A  A      A    '        ^ 

Y^  =  1.004  and  y   =  .00667  (t  statistic  =  -7.2,  standard  error  .00093),  but 
the  addition  of  the  linear  term  in  Z  explains  only  about  11.5%  of  the  total 
variance  of  observed  values  of  the  ratio  of  a  values.   A  graph  of  quantities 
of  the  empirical  distribution  of  residuals  w    versus  unit  normal  quantities 
given  in  Figure  5.17  shows  this  slight  improvement. 
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TABLE  5.2 

SUMMARY  MEASURES  FOR  MONTE  CARLOED  SAMPLING 
DISTRIBUTIONS  OF  ESTIMATORS  a  AND  3 

SYMMETRIC  PAIR         ASYMMETRIC  PAIR 


Measures  of  Location     a. 

_6 

a 

B_ 

Median 

43.59 

42.40 

43.65 

43.97 

Mean 

43.60 

42.69 

43.49 

43.83 

Lower  quartile 

41.42 

39.02 

41.48 

42.94 

Upper  quartile 

45.75 

46.68 

45.54 

44.90 

Minimum 

33.80 

22.46 

31.82 

37.88 

Maximum 

53.05 

58.13 

51.24 

47.34 

Measures  of  Spread 


Standard  Deviation 

3.25 

5.97 

Interquartile  Range 

4.33 

7.66 

Range 

19.25 

25.67 

2.99  1.50 
4.06  1.86 
9.42     9.45 
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